Automatic assessment of alignment quality

Authors

  • Timo Lassmann
  • Erik L. L. Sonnhammer
Abstract

Multiple sequence alignments play a central role in the annotation of novel genomes. Given the biological and computational complexity of this task, the automatic generation of high-quality alignments remains challenging. Since multiple alignments are usually employed at the very start of data analysis pipelines, it is crucial to ensure high alignment quality. We describe a simple, yet elegant, solution to assess the biological accuracy of alignments automatically. Our approach is based on the comparison of several alignments of the same sequences. We introduce two functions to compare alignments: the average overlap score and the multiple overlap score. The former identifies difficult alignment cases by expressing the similarity among several alignments, while the latter estimates the biological correctness of individual alignments. We implemented both functions in the MUMSA program and demonstrate the overall robustness and accuracy of both functions on three large benchmark sets.
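The abstract names the two comparison functions but does not define them, so the following is only a minimal sketch under stated assumptions. It assumes an alignment is a list of equal-length gapped strings, that two alignments are compared through their sets of aligned residue pairs, and that the pairwise overlap is twice the number of shared pairs divided by the total number of pairs in both alignments. The function names and the exact formulas are illustrative choices and are not taken from the MUMSA program.

```python
from itertools import combinations


def aligned_pairs(alignment):
    """Return the set of aligned residue pairs in a gapped alignment.

    `alignment` is a list of equal-length strings using '-' for gaps.
    Each pair is ((seq_a, residue_a), (seq_b, residue_b)) with seq_a < seq_b.
    """
    counters = [0] * len(alignment)  # next residue index per sequence
    pairs = set()
    for col in range(len(alignment[0])):
        residues = []
        for seq_id, row in enumerate(alignment):
            if row[col] != "-":
                residues.append((seq_id, counters[seq_id]))
                counters[seq_id] += 1
        pairs.update(combinations(residues, 2))
    return pairs


def overlap(a, b):
    """Assumed pairwise overlap: shared pairs over the mean pair count."""
    pa, pb = aligned_pairs(a), aligned_pairs(b)
    if not pa and not pb:
        return 1.0
    return 2 * len(pa & pb) / (len(pa) + len(pb))


def average_overlap_score(alignments):
    """Mean pairwise overlap across all alignments of the same sequences."""
    scores = [overlap(a, b) for a, b in combinations(alignments, 2)]
    return sum(scores) / len(scores)


def multiple_overlap_score(index, alignments):
    """Assumed per-alignment score: the average fraction of the pairs in
    alignments[index] that also occur in each of the other alignments."""
    target = aligned_pairs(alignments[index])
    others = [aligned_pairs(a) for i, a in enumerate(alignments) if i != index]
    if not target or not others:
        return 0.0
    support = sum(len(target & other) for other in others)
    return support / (len(target) * len(others))


if __name__ == "__main__":
    # Three toy alignments of the same two sequences (ACG and ACTG).
    alns = [["AC-G", "ACTG"], ["ACG-", "ACTG"], ["AC-G", "ACTG"]]
    print(average_overlap_score(alns))
    for i in range(len(alns)):
        print(i, multiple_overlap_score(i, alns))
```

Under these assumptions, a low average overlap score indicates that the input alignments disagree with each other, which marks a difficult case. A higher multiple overlap score for one alignment means that more of its residue pairs are supported by the other alignments.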


Similar resources

FAMSD: A powerful protein modeling platform that combines alignment methods, homology modeling, 3D structure quality estimation and molecular dynamics.

The prediction of a protein three-dimensional (3D) structure is one of the most important challenges in computational structural biology. We have developed an automatic protein 3D structure prediction method called FAMSD. FAMSD is based on a comparative modeling method which consists of the following four steps: (1) generating and selecting sequence alignments between target and template protei...


Kalign, Kalignvu and Mumsa: web servers for multiple sequence alignment

Obtaining high quality multiple alignments is crucial for a range of sequence analysis tasks. A common strategy is to align the sequences several times, varying the program or parameters until the best alignment according to manual inspection by human experts is found. Ideally, this should be assisted by an automatic assessment of the alignment quality. Our web-site http://msa.cgb.ki.se allows ...


Video Summary Quality Evaluation Based on 4C Assessment and User Interaction

As video summarization techniques have attracted increasing attention for efficient multimedia data management, quality evaluation of video summary is required. To address the lack of automatic evaluation techniques, this chapter proposes a novel full-reference evaluation framework to assess the quality of the video summary according to various user requirements. First, the reference video summ...


Optimal Strategies of Increasing Business Alignment, in Social Security Organization, with Quality Function Deployment (QFD) Approach

Considering the importance of the concept of strategic alignment of information technology (IT) in today's economic organizations, this study attempted to extract the organization's IT strategies in order to increase the degree of strategic alignment and consequently the optimal strategies in the field of marketing and service delivery for social security organization. Using QFD technique and hie...


Measuring Word Alignment Quality for Statistical Machine Translation

Automatic word alignment plays a critical role in statistical machine translation. Unfortunately the relationship between alignment quality and statistical machine translation performance has not been well understood. In the recent literature the alignment task has frequently been decoupled from the translation task, and assumptions have been made about measuring alignment quality for machine t...



Journal:
  • Nucleic Acids Research

Volume 33  Issue 

Pages  -

Publication date 2005